An Effective Speech Understanding Method with a Multiple Speech Recognizer based on Output Selection using Edit Distance
Abstract
In this paper, we propose a simple and effective method for speech understanding that combines multiple speech recognizers. We use two recognizers: a large vocabulary continuous speech recognizer and a domain-specific speech recognizer. Integrating them gives a robust and flexible recognizer for speech understanding. For the integration process, we select an output using a simple edit distance measure computed on the output sentences of the recognizers. The method is scalable and accurate, and the experimental results show its effectiveness.
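The abstract does not say how the edit distance decides between the two outputs, so the following Python sketch is only one plausible reading, not the authors' procedure. It computes a word-level Levenshtein distance between the two hypotheses and, as an assumption, keeps the domain-specific output when the hypotheses are close and falls back to the large vocabulary output otherwise. The names `edit_distance` and `select_output`, the normalization by the longer sentence, and the `threshold` value are all hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (here: lists of words)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        curr = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def select_output(lvcsr_text, dssr_text, threshold=0.5):
    """Pick one recognizer output (assumed rule, not taken from the paper).

    If the two hypotheses are similar, the utterance is assumed to be
    in-domain and the domain-specific output is kept; otherwise the large
    vocabulary output is returned.
    """
    lvcsr_words = lvcsr_text.split()
    dssr_words = dssr_text.split()
    longest = max(len(lvcsr_words), len(dssr_words), 1)
    normalized = edit_distance(lvcsr_words, dssr_words) / longest
    return dssr_text if normalized <= threshold else lvcsr_text


if __name__ == "__main__":
    print(select_output("please bring the red cup", "bring the red cup"))
    print(select_output("what is the weather tomorrow", "bring the red cup"))
```

Working at the word level keeps the measure independent of spelling length. A real system would likely tune the threshold on held-out data, or compare each output against domain templates instead.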
Similar resources
Matching Inconsistently Spelled Names in Automatic Speech Recognizer Output for Information Retrieval
Many proper names are spelled inconsistently in speech recognizer output, posing a problem for applications where locating mentions of named entities is critical. We model the distortion in the spelling of a name due to the speech recognizer as the effect of a noisy channel. The models follow the framework of the IBM translation models. The model is trained using a parallel text of closed capti...
An effective feature compensation scheme tightly matched with speech recognizer employing SVM-based GMM generation
This paper proposes an effective feature compensation scheme to address a real-life situation where a clean speech database is not available for Gaussian Mixture Model (GMM) training for a model-based feature compensation method. The proposed scheme employs a Support Vector Machine (SVM)-based model selection method to effectively generate the GMM for our feature compensation method directly from ...
Combination of 3 Types of Speech Recognizers for Anaphora Resolution
In this paper, we propose a method for anaphora resolution in speech understanding for a livelihood support robot. For robust speech recognition, we combine two types of speech recognizers: a large vocabulary continuous speech recognizer (LVCSR) and domain-specific speech recognizers (DSSR). One problem in anaphora resolution is the lack of the antecedent in the outputs. To solve the problem, w...
A Binaural Microscopic Model Based on a Modulation Filter Bank for Predicting Speech Intelligibility in Normal-Hearing Listeners
In this study, a binaural microscopic model for the prediction of speech intelligibility based on the modulation filter bank is introduced. So far, spectral criteria such as the STI and SII, or other analytical methods, have been used in binaural models to determine binaural intelligibility. In the proposed model, unlike all models of binaural intelligibility prediction, an automatic ...
Constructing Acoustic Distances Between Subwords and States Obtained from a Deep Neural Network for Spoken Term Detection
The detection of out-of-vocabulary (OOV) query terms is a crucial problem in spoken term detection (STD), because OOV query terms are likely. To enable search of OOV query terms in STD systems, a query subword sequence is compared with subword sequences generated using an automatic speech recognizer against spoken documents. When comparing two subword sequences, the edit distance is a typical d...